The Bielefeld Jigsaw Map Game (JMG) Corpus

Authors

  • Andy Lücking
  • Peter Menke
  • Olga Abramov
  • Alexander Mehler
Abstract

Spoken language still poses a challenge to mechanisms developed for information processing and retrieval. Applications in this area often require a large amount of annotated data, which is hard to obtain for spoken language. We present a corpus of 78 semi-natural dialogues (length: ≈20 h), fully transcribed and annotated on various linguistic levels. The dialogues stem from a psycholinguistic, task-oriented coordination game, the Jigsaw Map Game (JMG) (Weiß, Pfeiffer, Schaffranietz, and Rickheit 2008). To solve the task of the JMG, the speakers produced spontaneous utterances about objects from a predefined object set, which have to be located according to a map. What makes this data special is the combination of fixed utterance topics with otherwise unconstrained language use. Primarily developed and annotated to study alignment in communication (Pickering and Garrod 2004), the corpus represents a useful resource for natural language processing and studies on spoken language. We describe (1) the theoretical motivation of the JMG, (2) the experimental setting used for data gathering, (3) the levels of annotation and their correction, and (4) the applications of the JMG corpus in the area of lexical alignment.

Similar resources

Learning Application with the Multi-Touch Interactive Technology-A Study of Jigsaw Game

Multi-touch technology has become an increasingly popular interactive technology for teaching content. Displays with multi-touch are responsive enough to support a wide variety of applications. In this study, we attempt to apply three different technologies (C#, Microsoft Surface, and Windows Presentation Foundation) to develop an interactive jigsaw game. This game would design with the multi-t...

NGTSOM: A Novel Data Clustering Algorithm Based on Game Theoretic and Self-Organizing Map

Identifying clusters is an important aspect of data analysis. This paper proposes a novel data clustering algorithm to increase the clustering accuracy. A novel game theoretic self-organizing map (NGTSOM) and neural gas (NG) are used in combination with Competitive Hebbian Learning (CHL) to improve the quality of the map and provide a better vector quantization (VQ) for clustering data. Different ...

The ALICO corpus: analysing the active listener

The Active Listening Corpus (ALICO) is a multimodal data set of spontaneous dyadic conversations in German with diverse speech and gestural annotations of both dialogue partners. The annotations consist of short feedback expression transcriptions with corresponding communicative function interpretations as well as segmentations of interpausal units, words, rhythmic prominence intervals and vowe...

Building a DDC-annotated Corpus from OAI Metadata

Document servers complying with the standards of the Open Archives Initiative (OAI) are a rich, yet seldom exploited source of textual primary data for research fields in text mining, natural language processing or computational linguistics. We present a bilingual (English and German) text corpus consisting of bibliographic OAI records and the associated full texts. A particular added value is that...

Language Skill-Task Corollary: The Effect of Decision-Making vs. Jigsaw Tasks on Developing EFL Learners’ Listening and Speaking Abilities

Task-Based Language Teaching (TBLT) has featured in the pertinent literature for many years. However, the role of specific task types in developing specific skills seems to be among the unexplored issues in the literature. To shed more light on this issue, the present study was conducted to compare the effect of jigsaw and decision-making tasks on improving listening and speaking abilities o...

Publication date: 2011